SYSTEMS WHICH RECOGNIZE ALPHANUMERIC SYMBOLS
RECORDED ON A PAPER OR FILM BEARER

M. A. Engineer, Zbigniew Kedzior

1.  INTRODUCTION 

Information processing has a particular interest in systems
that read alphanumeric symbols directly from original documents.
Given the enormous quantities of data processed by digital
machines, reading automata are needed that would replace a human
in the process of preparing data, i.e., would shorten the time
needed to introduce the data and thereby yield additional
economic benefits.

From the point of view of the principle of operation, systems
that read alphanumeric characters can be assigned to the
following three groups [1]:

1.  Optical  readers  of  symbols. 

2.  Readers  of  symbols  recorded  in  magnetic  ink. 

3.  Readers  of  symbols  recorded  on  a film  bearer. 

The readers of group 1, called optical character recognition
(OCR) systems, attract the greatest interest. This is the most
numerous group in terms of the number of types. On the markets
of the western European countries, the USA, and Japan more than
50 types of this kind of reader are currently available, and
around 30 firms are engaged in producing them. These systems read
stylized machine print as well as the print of ordinary
typewriters and line printers. Many of them read handwritten
numerical symbols written in accordance with suitable rules
(so-called hand printing). Table 1 lists the parameters of
several selected types of OCR systems produced by American firms
(data from the years 1972-1973). In the USSR several types of
alphanumeric symbol readers are likewise produced, for instance
the RUTA 701 reader and the Sever (North) 3.

Besides the above, group 1 can also include readers of symbols
supplemented by coding dashes, as well as readers of dashes.
These systems were developed in the period when OCR technology
was at a low level. Readers of dashes have spread in western
European countries and Canada; however, demand for this type of
system can be expected to decrease in the future because of the
dynamic development of optical technology for recognizing
alphanumeric symbols.

The readers of group 2 arose at more or less the same time as
the readers of dashes and of symbols with coding dashes. In these
readers the recording technology ensures greater read-off
reliability, but on the other hand imposes very high requirements
on print quality, which complicates their use and raises its
cost. The prices of commercial systems are relatively high,
ranging between $3^,000 and $123,000. Magnetic-ink symbols are
difficult for a human to read. These systems currently attract
little interest. Their traditional use is the reading of checks
in banks. No new types of this kind of system have appeared in
recent years.

The most modern systems are the readers of group 3. They offer
high precision in reading alphanumeric symbols as well as the
ability to recognize graphical information. For character
recognition they use the same technology as the systems of
group 1. The number of print types read is very large, and the
read-off speed is relatively high. A certain difficulty is the
need to transfer documents to the film bearer, i.e., to film
them, but on the other hand information recorded in this manner
is simple to preserve. The main obstacle to their spread is their
very high price. Only two firms are known to produce these
systems: Compuscan (first model produced in 1970) and
Information International (first model in 1971).

Summarizing the above, only optical readers of alphanumeric
characters recorded on paper and film bearers can be counted as
both new and promising. For this reason the discussion below
deals only with this type of system.

2. REASONS FOR THE LIMITED SPREAD OF READERS

Factors that hinder the spread of OCR systems are their high
prices and the requirements on print quality. In 1970 around
1000 systems that recognized numerical and alphanumeric
characters were in use. Their prices ranged from 25 thousand to
one million dollars. Moreover, the prices of the two microfilm
systems mentioned above (without optics) are, respectively,
$900,000 for the Compuscan 370 OCR system and $1.5 million for
the Graphix Reader of the firm Information International. The
requirement of high print quality discourages users.

To improve the operating parameters, manufacturers use stylized
print. As with readers of characters recorded in magnetic ink or
readers of characters with coding dashes, stylized print hinders
the spread of character recognition systems.

Users must read documents prepared on special typewriters.
Meanwhile, manufacturers still use various types of stylized
print [2]. Most readers currently produced require special forms
(including check digits, read-off fields, control symbols, or
cancelling dashes). Whenever the type of data read changes, new
types of forms must be designed. The requirements on paper
quality are also high.

3.  DIVISION  OF  SYSTEMS  AND  THEIR  USE 

From the user's point of view, optical readers of characters
recorded on a paper bearer are divided into:

a)  document  readers 

b)  page  readers 

c)  readers  of  pages  and  documents 

d)  readers  of  recording  tape. 

Document readers are systems that read information in selected
places (so-called fields) of a form. The information read
usually contains no more than 200 characters; the read-off
fields are located in one or two lines. The parameter that
characterizes operating speed is the rate at which documents are
read, which, depending on the type of system, ranges from 60 to
1200 documents/minute. These systems predominantly read
numerical characters.

Page readers, by contrast, are systems that read all the
characters appearing on a document. The significant parameter
here is the character reading speed (400-3600 characters/sec).
Page readers recognize alphanumeric characters. This group
includes universal systems capable of recognizing many types of
machine print and handwritten numerical characters and of
reading documents of various formats, and which offer many other
advantages.

The above two types of systems differ from one another
basically in mechanical construction. They are used in
different fields.

For users with a wide range of applications, universal systems
(group c) are produced, which have either two transport
mechanisms or very fast mechanisms adapted to transporting both
small and large documents. The readers of group d read
characters recorded on the tape of recording cassettes or of
arithmometers; they recognize only machine-printed numerical
characters and some special symbols. They have limited
application. Hence the number of types available on the market
is small (moreover, many systems of groups a, b, and c are
supplied, as an option, with a tape transport mechanism).

Readers of characters recorded on film bearers are page
readers. OCR systems are used most widely (the first sphere of
applications) for processing proofs of payment (turnaround
applications) such as:

. checks  (credit) 

. proofs  of  payment  (receipts)  for  electric  energy  and  gas 

. proofs  of  payment  of  taxes,  of  insurance  collections, 
subscriptions 

. lottery tickets

Another sphere of applications is the processing of documents
that arise within a single institution that has a computing
center (in-house applications): banks, industrial enterprises,
public institutions.

In the USA, Great Britain, and the Federal Republic of Germany
there are OCR service bureaus. These are centers that prepare
data from documents supplied by clients. The data read are
recorded on magnetic tape, paper tape, or punched cards.


A third sphere is field applications. Here representatives of
firms (agents, travelling agents) fill out specific forms, which
are read by machine at a center.

4.  POTENTIAL  BUYERS  IN  POLAND 

Some government institutions (statistical offices, the printing
industry, the post office) are interested in acquiring systems
that automatically read information recorded on various kinds of
documents and writings. Other institutions and offices, in view
of the significant amount of information they store and process,
can also be regarded as potential buyers and users of character
readers.

In the opinion of the author, institutions such as the patent
office of the Polish People's Republic, the printing industry,
and many libraries should have at their disposal systems that
read characters recorded on microfilm. The post office, many
computing centers, commercial houses, and tourist bureaus should
use direct read-off from documents. Statistical offices,
moreover, should make use both of document reading systems and
of systems that read information carried on a film bearer.

Still another sphere of OCR applications should be mentioned,
namely the use of read-off subsystems (see section 5.1) or of
complete character readers in reading rooms for the blind.

5.  PRINCIPLES  OF  CONSTRUCTION  OF  OCR  SYSTEMS 
5.1.  Electronic  Subsystems 

Figure 1 presents a simplified block diagram of a reader. The
basic electronic subsystems of a character reader are the
receptor, or read-off subsystem, and the classifier, or
recognizer subsystem. The types of designs of these subsystems
are discussed below.


Read-off  Subsystems 


Receptors of currently built optical character readers make use
of the following read-off techniques:

a) relative motion of the documents under a column or retina of
photoelements,

b) illumination of the points of a character raster by a narrow
beam of non-coherent light guided by a system of moving mirrors,
or illumination with the help of a rotating disk with apertures,

c) illumination of the points of a raster by a narrow beam of
light from a radiating tube (the moving-spot method),

d) read-off with the help of an accumulating tube,

e) investigation of the contour of the character.

The read-off technique of point a) gives a high read-off speed
but is costly. A design of type b) is among the cheapest, but
its read-off speed is low.

Techniques a) and b) are among the most popular. The moving-spot
method makes it possible to simplify the mechanism, but it is
very costly, while its read-off speed is average (for instance,
in Philco-Ford's M6000 reader it amounts to 1250 characters/sec
when reading alphanumeric characters).

In method d) the read-off speed depends on the type of tube
used, which can be either a vidicon or an image dissector tube.
In the first case the speed is low (250-500 characters/sec) and
in the second average (2000 characters/sec). This method has not
spread widely. No designs with vidicon tubes are found in
readers currently on sale, and only one reader (type 20/20 of
the firm Scan Optics) uses an image dissector tube (a tube built
above all for optical recognition).

Method e) gives an average read-off speed. It is used above all
in the recognition of handwritten characters.

Recognizing  Subsystems 

The recognition methods used in readers available on the
markets of the western countries can be assigned to two groups:

a) the method of matching to standards,

b) the method of analysis of features.

The method of fragments (stroke analysis) and the method of
analyzing tangents to segments of the contour (curve tracing)
can be regarded as subgroups of the analysis of features.

Method a) is used for recognizing machine-printed characters.
It attains the highest read-off speeds (up to 3600
characters/sec) and is very popular. Its disadvantage is that it
cannot recognize handwritten characters. Most methods from the
analysis of features group are applied to the recognition of
handwritten characters or in systems that read several types of
print. The method of fragments is an exception: it is not suited
to the analysis of handwritten characters. Several readers apply
two methods simultaneously: one for machine characters (a) and
the other for handwritten characters (b).

An  Example  of  a Design 

Below is an abbreviated description of the type H8959 reader of
the Hitachi firm (Japan) [2], a model of which was produced in
1973. This is a page reader that recognizes handwritten
numerical characters or alphanumeric machine characters. It is
an example of a simple and at the same time modern design. The
receptor of the reader is of the electromechanical type and uses
a laser light source.


Figure 2. Kinematic schematic diagram of the electromechanical
read-off subsystem of the Hitachi firm's H8959 reader (labeled
element: set of rotating mirrors, direction a).


Figure 2 presents a kinematic schematic diagram of the read-off
subsystem. A suitable arrangement of the small mirrors in the
system of rotating mirrors, together with matching the speed of
this system to the period of a single oscillating mirror, causes
the very narrow beam of light emitted by the laser to shift
successively in time (mechanical scanning), in the same way as
in a television system. The small arrows on the diagram show the
direction in which the laser beam moves across the document. The
light reflected from the paper is received by a photomultiplier
tube and is then amplified and converted into digital signals.

The use of a coherent source for read-off gives many
advantages, the most important of which is the improved
signal-to-noise ratio. This makes it easier to distinguish
blackened places from non-blackened ones and removes the need
for complicated multistage amplifiers. The above receptor design
is distinguished by its low cost. The read-off speed is low
because a mechanical deflection (scanning) system is used.
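
The conversion of the amplified photomultiplier signal into digital signals can be illustrated by a simple threshold decision. The sketch below is not taken from the H8959 description: the midpoint threshold, the function name `binarize`, and the normalized reflectance values are assumptions made for illustration only.

```python
from typing import List


def binarize(samples: List[float], white_level: float, black_level: float) -> List[int]:
    """Map analog reflectance samples to 1 (blackened) or 0 (not blackened).

    Assumption of this sketch: the decision threshold lies midway between
    the reflectance of clean paper and that of a blackened place.
    """
    threshold = (white_level + black_level) / 2.0
    # Low reflected light means the beam hit a blackened place.
    return [1 if s < threshold else 0 for s in samples]


# One scan line crossing a single stroke (hypothetical values).
scan_line = [0.90, 0.85, 0.20, 0.15, 0.80]
bits = binarize(scan_line, white_level=0.9, black_level=0.1)
```

The larger the signal-to-noise ratio, the further the samples stay from the threshold, which is why a coherent source makes the two classes easier to separate.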

The recognition system comprises two basic subsystems, namely:

. the preliminary conversion system

. the recognition subsystem proper

When handwriting is recognized, the preliminary conversion
system performs, among others, the following functions:

. normalization of the dimensions

. completion of breaks in the lines of a character

. removal of darkened places not associated with characters

. shading (actually a process of horizontal averaging).

The above functions are carried out by logic subsystems. The
results of their operation can be illustrated by the processing
of the symbol "2" (Figure 3a and b).

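The horizontal averaging named in the list above can be sketched as a majority vote over a short horizontal window. This is only an illustration under assumptions: the three-point window, the majority rule, and the name `horizontal_average` are not given in the description of the H8959 reader.

```python
from typing import List

Grid = List[List[int]]  # 1 = blackened point, 0 = non-blackened point


def horizontal_average(grid: Grid, window: int = 3) -> Grid:
    """Replace each point by the majority value of a horizontal window.

    The window width and the majority rule are assumptions of this sketch.
    """
    half = window // 2
    out: Grid = []
    for row in grid:
        new_row = []
        for x in range(len(row)):
            segment = row[max(0, x - half):min(len(row), x + half + 1)]
            # Black only if more than half of the window is black.
            new_row.append(1 if 2 * sum(segment) > len(segment) else 0)
        out.append(new_row)
    return out


raster = [
    [1, 1, 0, 1, 1],  # horizontal stroke with a one-point break
    [0, 0, 1, 0, 0],  # isolated dark point not belonging to a character
]
smoothed = horizontal_average(raster)
```

In this toy raster the same operation closes the break in the stroke and removes the isolated point. A vertical stroke only one point wide would also be erased, so the sketch presumes a raster fine enough that strokes are several points wide.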
The signals of the shaded (averaged) picture are fed into the
recognizing system, which divides the lines of the character
into sections, each with an approximate direction coefficient
of the tangent to the curve, assigns a chain code, and finally
recognizes the character.

All possible directions of the tangent to the lines of the
character are reduced to the eight directions of a compass rose,
to which the digits 0 to 7 are assigned. As a result of the
operation of a special logic network, each character being
recognized is assigned a final sequence (or chain) of digits, on
the basis of which the character is recognized. The manner of
building the chain code is illustrated by Figure 3 (c and d).
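
The reduction of tangent directions to eight digits can be sketched as follows. The numbering used here (0 = east, counting counter-clockwise in 45-degree steps, y axis pointing up) and the merging of repeated digits are assumptions of this sketch; the source does not state which digit the H8959 assigns to which direction, and the names `direction_code` and `chain_code` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def direction_code(p: Point, q: Point) -> int:
    """Quantize the direction from p to q to one of eight codes 0..7."""
    angle = math.atan2(q[1] - p[1], q[0] - p[0])  # radians in [-pi, pi]
    return round(angle / (math.pi / 4)) % 8


def chain_code(points: List[Point]) -> List[int]:
    """Build a chain of direction digits along a traced stroke.

    Consecutive equal digits are merged into one, so each digit stands
    for one section of roughly constant direction (an assumption).
    """
    chain: List[int] = []
    for p, q in zip(points, points[1:]):
        if p == q:
            continue  # no movement, no direction
        code = direction_code(p, q)
        if not chain or chain[-1] != code:
            chain.append(code)
    return chain


# A stroke shaped like the digit "7": a bar to the right, then a
# diagonal running down and to the left.
seven = [(0, 2), (1, 2), (2, 2), (1, 1), (0, 0)]
code_of_seven = chain_code(seven)
```

A recognizer of this kind would then compare the resulting chain with the chains stored for each character.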


The above method of recognizing handwritten characters, which
includes functions such as the elimination of noise (the
completion of breaks and the removal of blotches) and shading
(averaging), has modern features. Handwritten characters cannot
be written arbitrarily. This means that a person filling out a
document must be familiar with the rules of writing (Figure 4).


Figure 4. Supplementary training for persons filling out
documents with handwriting. The figure pairs correct and wrong
examples for the following rules:

. Do not connect the characters to one another.

. The lines of the characters cannot have breaks.

. The form of the characters must be simple and without
"hooks (tails)".

Figure 3. Graphic illustration of the functions realized in the
recognition subsystem for handwritten characters in the H8959
reader: a) matrix of the input signals; b) matrix of the shaded
(averaged) signals; c) manner of marking the directions;
d) manner of coding the characters being recognized.


For recognizing machine-printed characters the H8959 reader
uses the conventional method of fitting masks (standards).


The classification system adopted here recognizes up to 50
different types of characters simultaneously. The selection of
standards for each type of machine print is mounted as one set.
When the type of machine print changes, this set must be
exchanged for another.
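
The fitting of masks can be sketched as a point-by-point comparison of the read character with each stored standard. The scoring rule (fraction of agreeing points), the rejection threshold of 0.8, and the tiny 3 x 3 masks are assumptions chosen for illustration; the source does not describe how the H8959 scores a fit.

```python
from typing import Dict, List, Optional

Grid = List[List[int]]  # 1 = blackened point, 0 = non-blackened point


def match_score(image: Grid, mask: Grid) -> float:
    """Fraction of raster points on which image and mask agree."""
    total = 0
    agree = 0
    for image_row, mask_row in zip(image, mask):
        for a, b in zip(image_row, mask_row):
            total += 1
            agree += 1 if a == b else 0
    return agree / total


def classify(image: Grid, masks: Dict[str, Grid], min_score: float = 0.8) -> Optional[str]:
    """Return the label of the best-fitting mask, or None to reject."""
    best_label: Optional[str] = None
    best_score = 0.0
    for label, mask in masks.items():
        score = match_score(image, mask)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= min_score else None


# One hypothetical set of standards for a single type of print.
MASKS = {
    "1": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "7": [[1, 1, 1], [0, 0, 1], [0, 0, 1]],
}
noisy_one = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]  # one spurious dark point
blank = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

Exchanging the set of standards for another type of print corresponds here to passing a different `MASKS` dictionary.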

Basic Parameters of the H8959 Reader

1) Types of writing (print):

   . handwritten numerical characters as well as the symbols
     C, S, T, X and Z

   . machine stylized numerical symbols of the type OCR-A,
     OCR-B, 407 (IBM) and 12F (Farrington)

   Options:

   . alphanumeric stylized characters OCR-A

   . numerical characters of various typewriter types

2) The system reads machine and hand numerical symbols
   simultaneously (i.e., from a single document)

3) Dimensions of the documents: from 145 x 95 to 300 x 220 mm

4) Maximum number of characters in a line:

   . hand characters: 36

   . machine characters: 72

5) Maximum number of lines on a document:

   . with hand characters: 25

   . with machine characters: 29

6) Maximum speed of read-off:

   . hand characters: 50 characters/sec

   . machine characters: 100 characters/sec

   . forms (to be filled out): 36 documents/min

7) Output vehicle for the data: paper tape

8) Dimensions of the system in mm (without a paper tape punch):
   1160 x 630 x 1360

   . weight: around 360 kg

   . dimensions of the paper tape punch: 490 x 425 x 3^0 mm

9) Permissible temperature and humidity of the surroundings of
   the system: 5 - 35° C and 30 - 85% relative humidity.

From the above data it follows that the parameters of the
system are inferior to those of many readers from firms such as
REI, CDC, Scan Data Corporation, or IBM. The H8959 system is
limited in the number of characters that can appear on one
document and in the number of types of print, and is also
distinguished by a low read-off speed. On the other hand, it is
a system of small dimensions and is reputed to be among the
relatively cheap ones.

5.2.  Mechanisms  of  Moving  Documents 

The most complicated are the mechanisms of document readers and
of page readers. Modern systems use friction feeds, vacuum
feeds, and feeds with conical wheels. For transport, vacuum
drums are used. Modern receivers use drum mechanisms. The
mechanisms in currently produced readers are relatively costly.
According to one opinion [1], their production cost constitutes
50% of the cost of the whole reader.

6. WORK CARRIED OUT IN POLAND

For several years the Institute for Organizations and Direction
of PAN (Polish Academy of Sciences) has carried on both basic
research and certain applied work in the field of recognition.
A biocybernetic direction (trend) is developing here, which
strives toward constructing a model of an artificial neuron
network that simulates the human nervous system, as well as a
programming direction concerned with seeking general principles
of recognition with the help of simple operations realized by an
electronic digital machine or a specialized system. This work
touches on many disciplines of object recognition, including the
recognition of alphanumeric characters and of graphic
information. Among other things, a model of a system that reads
machine-printed alphanumeric characters at a read-off speed of
up to 40 characters/second has been developed and put into
operation.

Table 1. List of some types of alphanumeric optical character
readers manufactured by American companies (basic parameters).

Column headings: manufacturer; document speed, pieces/min;
reading speed, characters/sec; number of characters on document;
types of script read and characters; data output; document size
(mm); operational method; year when manufactured; cost in US
dollars (system; system with options). The surviving entries are
listed by row:

1. Data output: directly to computer. Document size: from
   67.7x76 to 108x222. Operational method: column of
   photoelements and matching of masks. Year: 1970.

2. Data output: magnetic tape. Document size: from 66x114 to
   114x229. Operational method: laser beam and matching of
   masks. Year: 1971.

3. Data output: directly to computer. Document size: from
   76x166 to 227x355. Operational method: moving-spot method.
   Year: no data.

4. Data output: magnetic tape, punched cards, paper tape,
   directly to computer. Document size: from 127x178 to 216x229.
   Operational method: moving-spot method and matching of masks.
   Year: 1965.

5. Data output: directly to computer. Document size: microfilm
   tape, 16 or 35 mm. Operational method: moving-spot method,
   matching of masks, and feature analysis. Year: 1970.

6. Data output: magnetic tape, paper tape, punched cards.
   Document size: from 102x82.5 to 227x355. Operational method:
   laser beam and feature analysis. Year: 1970.

7. Data output: magnetic tape, paper tape, punched cards.
   Document size: from 127x76. Operational method: moving-spot
   method and feature analysis. Year: 1971.

8. Data output: magnetic tape. Operational method: moving-spot
   method. Year: 1969. Cost: no data.

Cost in US dollars, system (in source order): 1)2000; no data;
223 390; D00 000; 900 000; 33 600; 215 000.

Cost in US dollars, system with options (in source order):
6D 560; 61 000; 296 D80; 296 1)80; no data; no data; 1)1 250;
315 000.

7. TRENDS IN THE CONSTRUCTION OF OCR SYSTEMS

A source of coherent light is beginning to be used for reading
characters. This design ensures above all a high signal-to-noise
ratio of the analog signals received by the receptor, which
increases the accuracy of the read-off; it also makes it
possible to use lower grades of paper and to dispense with
darkening the surroundings of the read-off head. This design is
used in commercial readers such as the Cognitronics firm's
System 70 (1970), the 921 DR of the CDC firm (1971), and the
page reader type H8959 of the Hitachi firm (1973) described in
section 5.1.

At a conference on OCR held in 1967 in Delft (Holland), the
possibility of using holography for character recognition was
discussed [3]. Research in this direction is being carried on in
many countries. However, so far there is no information about
the sale of readers that exploit the phenomenon of holography.

Lately a technique called context recognition is used more and
more frequently, in which the type of the neighboring characters
is analyzed along with the character being recognized.

Modern designs use a two-stage analysis. First the so-called
small (local) features are recognized, and then a final
identification is carried out within a group of characters of
similar shape. "Supplementary instruction" is also used when
characters of deviating shape appear, or even a certain learning
carried out by the user.
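
The context recognition mentioned earlier in this section can be sketched as a rule that settles an ambiguous shape by the type of its neighbors. Everything specific in the sketch is an assumption: the table of confusable shapes, the one-character neighborhood, and the name `resolve_by_context` are illustrative and do not describe any commercial reader.

```python
from typing import List

# Hypothetical pairs of easily confused shapes: letter -> digit.
LETTER_TO_DIGIT = {"O": "0", "I": "1", "S": "5", "Z": "2", "B": "8"}
DIGIT_TO_LETTER = {digit: letter for letter, digit in LETTER_TO_DIGIT.items()}


def resolve_by_context(chars: List[str]) -> List[str]:
    """Re-label ambiguous characters according to their direct neighbors."""
    result = list(chars)
    for i, ch in enumerate(chars):
        neighbours = chars[max(0, i - 1):i] + chars[i + 1:i + 2]
        # Only neighbors that are not themselves ambiguous count as evidence.
        digits = sum(1 for n in neighbours
                     if n.isdigit() and n not in DIGIT_TO_LETTER)
        letters = sum(1 for n in neighbours
                      if n.isalpha() and n not in LETTER_TO_DIGIT)
        if ch in LETTER_TO_DIGIT and digits > letters:
            result[i] = LETTER_TO_DIGIT[ch]
        elif ch in DIGIT_TO_LETTER and letters > digits:
            result[i] = DIGIT_TO_LETTER[ch]
    return result


year = "".join(resolve_by_context(list("19O7")))
word = "".join(resolve_by_context(list("C0DE")))
```

A character surrounded by digits is read as a digit and one surrounded by letters as a letter; when the evidence is balanced, the original decision is kept.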

Work has been carried out on the recognition of handwritten
alphabetic characters, resulting in a proposal to establish
standards for handwritten characters [5].

In the face of low demand due to high prices, manufacturers
have begun to build significantly cheaper systems specialized
for specific applications.

8.  PROPOSALS 

With the growth of computer applications in Poland in recent
years, the number of potential purchasers of alphanumeric
character readers is increasing. Since systems of this type are
very costly, suitable preparatory work should begin now so that
the desired technical results can be obtained at the lowest
possible cost. For this purpose a uniform typeface should be
specified for all newly purchased typewriters, both imported and
domestically produced, and especially for those that will be
used to prepare source documents for data conversion.

It is necessary to set up an institution that would take charge
of the sphere of applications of OCR systems. Among other
things, this institution should take on the matter of supplying
future users of the systems with domestically produced printing
materials that ensure high print quality (forms for machine
read-off, paper grades, and typewriter ribbons).

FTD-ID(RS)I-1756-76